disagreement rate
When are ensembles really effective?
Ensembling has a long history in statistical data analysis, with many impactful applications. However, in many modern machine learning settings, the benefits of ensembling are less ubiquitous and less obvious. We study, both theoretically and empirically, the fundamental question of when ensembling yields significant performance improvements in classification tasks. Theoretically, we prove new results relating the \emph{ensemble improvement rate} (a measure of how much ensembling decreases the error rate versus a single model, on a relative scale) to the \emph{disagreement-error ratio}. We show that ensembling improves performance significantly whenever the disagreement rate is large relative to the average error rate; and that, conversely, one classifier is often enough whenever the disagreement rate is low relative to the average error rate. On the way to proving these results, we derive, under a mild condition called \emph{competence}, improved upper and lower bounds on the average test error rate of the majority vote classifier. To complement this theory, we study ensembling empirically in a variety of settings, verifying the predictions made by our theory, and identifying practical scenarios where ensembling does and does not result in large performance improvements. Perhaps most notably, we demonstrate a distinct difference in behavior between interpolating models (popular in current practice) and non-interpolating models (such as tree-based methods, where ensembling is popular), demonstrating that ensembling helps considerably more in the latter case than in the former.
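The quantities named here are easy to compute for a concrete ensemble. Below is a minimal sketch (our own illustration, not the paper's code) that trains a small bagged ensemble and reports an ensemble improvement rate and a disagreement-error ratio; the formulas are our reading of the abstract's definitions, so treat them as assumptions.

```python
from itertools import combinations

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Bagged ensemble: each tree is fit on a bootstrap sample of the training set.
rng = np.random.default_rng(0)
preds = []
for _ in range(11):
    idx = rng.integers(0, len(X_tr), size=len(X_tr))
    tree = DecisionTreeClassifier().fit(X_tr[idx], y_tr[idx])
    preds.append(tree.predict(X_te))
preds = np.array(preds)                      # shape: (n_models, n_test)

avg_err = np.mean(preds != y_te)             # average single-model error rate
majority = (preds.mean(axis=0) > 0.5).astype(int)
mv_err = np.mean(majority != y_te)           # majority-vote error rate

# Disagreement rate: probability that two distinct ensemble members disagree.
dis = np.mean([np.mean(preds[i] != preds[j])
               for i, j in combinations(range(len(preds)), 2)])

print(f"ensemble improvement rate: {(avg_err - mv_err) / avg_err:.3f}")
print(f"disagreement-error ratio:  {dis / avg_err:.3f}")
```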
An urn model for majority voting in classification ensembles
Victor Soto, Alberto Suárez, Gonzalo Martinez-Muñoz
In this work we analyze the class prediction of parallel randomized ensembles by majority voting as an urn model. For a given test instance, the ensemble can be viewed as an urn of marbles of different colors. A marble represents an individual classifier. Its color represents the class label prediction of the corresponding classifier. The sequential querying of classifiers in the ensemble can be seen as draws without replacement from the urn. An analysis of this classical urn model based on the hypergeometric distribution makes it possible to estimate the confidence in the outcome of majority voting when only a fraction of the individual predictions is known. These estimates can be used to speed up prediction by the ensemble. Specifically, the aggregation of votes can be halted when the confidence in the final prediction is sufficiently high. If one assumes a uniform prior for the distribution of possible votes, the analysis is shown to be equivalent to a previous one based on Dirichlet distributions.
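For the binary case, this confidence can be computed directly. The following sketch (our own implementation, under the uniform-prior assumption the abstract mentions) places a uniform prior on the urn's composition and uses the hypergeometric likelihood of the observed votes to obtain the posterior probability that the full majority vote will favor the current leading class:

```python
import numpy as np
from scipy.stats import hypergeom

def majority_confidence(T, t, k):
    """P(final majority is class 1 | k of the first t votes were class 1),
    with a uniform prior over the number n of class-1 voters in the urn."""
    n = np.arange(T + 1)
    # Likelihood of observing k class-1 votes in t draws without
    # replacement from an urn with n class-1 marbles out of T total.
    like = hypergeom.pmf(k, T, n, t)
    post = like / like.sum()
    return post[n > T / 2].sum()

# Voting can be halted once this confidence exceeds a chosen level.
T = 101
for t, k in [(11, 9), (25, 18), (51, 30)]:
    print(f"after {t} votes ({k} for class 1): "
          f"confidence {majority_confidence(T, t, k):.3f}")
```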
Reliably Detecting Model Failures in Deployment Without Labels
Viet Nguyen, Changjian Shui, Vijay Giri, Siddharth Arya, Amol Verma, Fahad Razak, Rahul G. Krishnan
The distribution of data changes over time; models operating in dynamic environments need retraining. But knowing when to retrain, without access to labels, is an open challenge, since some, but not all, shifts degrade model performance. This paper formalizes and addresses the problem of post-deployment deterioration (PDD) monitoring. We propose D3M, a practical and efficient monitoring algorithm based on the disagreement of predictive models, which achieves low false positive rates under non-deteriorating shifts and provides sample complexity bounds for high true positive rates under deteriorating shifts. Empirical results on both standard benchmarks and a real-world large-scale internal medicine dataset demonstrate the effectiveness of the framework and highlight its viability as an alert mechanism for high-stakes machine learning pipelines.
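The paper's D3M algorithm is not spelled out in the abstract, so the sketch below only illustrates the underlying mechanism: track the disagreement rate between the deployed model and a companion model on unlabeled batches, and alert when it exceeds a threshold calibrated on in-distribution data. The quantile-based threshold is our assumption, not the paper's decision rule.

```python
import numpy as np

def disagreement(model_a, model_b, X):
    """Fraction of unlabeled inputs on which the two models disagree."""
    return np.mean(model_a.predict(X) != model_b.predict(X))

def calibrate_threshold(model_a, model_b, id_batches, quantile=0.99):
    """Upper quantile of disagreement over held-out in-distribution batches."""
    rates = [disagreement(model_a, model_b, X) for X in id_batches]
    return float(np.quantile(rates, quantile))

def monitor(model_a, model_b, batch, threshold):
    rate = disagreement(model_a, model_b, batch)
    return rate > threshold, rate  # (raise alert?, observed disagreement)
```

In practice the companion model could simply be a retrain of the deployed model with a different seed, so that elevated disagreement flags inputs the training distribution no longer pins down.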
AdaRank: Disagreement Based Module Rank Prediction for Low-rank Adaptation
With the rise of language and multimodal models of ever-increasing size, pretraining a general-purpose foundational model and adapting it to downstream tasks has become common practice. To this end, adaptation efficiency can be a critical bottleneck given the large model sizes, hence efficient finetuning methods such as LoRA have become prevalent. However, LoRA is typically applied with the same rank across all model layers, despite mounting evidence from the transfer learning literature that during finetuning, later layers diverge more from pretrained weights. Inspired by the theory and observations around feature learning and module criticality, we develop a simple model-disagreement-based technique to predict the rank of a given module relative to the other modules. Empirically, AdaRank generalizes notably better on unseen data than using uniform ranks with the same number of parameters. Compared to prior work, AdaRank has the unique advantage of leaving the pretraining and adaptation stages completely intact: no need for any additional objectives or regularizers, which can hinder adaptation accuracy and performance. Our code is publicly available at https://github.com/google-research/google-research/tree/master/adaptive_low_rank.
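The abstract does not give AdaRank's scoring rule, so the sketch below only illustrates the allocation step we infer from it: given a per-module disagreement score (however it is obtained), spread a fixed total rank budget across modules in proportion to those scores. All names and numbers here are hypothetical.

```python
import numpy as np

def allocate_ranks(scores, total_rank, min_rank=1):
    """Distribute `total_rank` over modules proportionally to `scores`."""
    scores = np.asarray(scores, dtype=float)
    raw = scores / scores.sum() * total_rank
    ranks = np.maximum(np.floor(raw).astype(int), min_rank)
    # Hand leftover budget to modules with the largest fractional remainders.
    for i in np.argsort(raw - np.floor(raw))[::-1]:
        if ranks.sum() >= total_rank:
            break
        ranks[i] += 1
    return ranks

# Hypothetical disagreement scores for four transformer blocks; later
# layers disagreeing more is the pattern the abstract describes.
print(allocate_ranks([0.05, 0.10, 0.30, 0.55], total_rank=32))
```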
Domain Shift Analysis in Chest Radiographs Classification in a Veterans Healthcare Administration Population
Mayanka Chandrashekar, Ian Goethert, Md Inzamam Ul Haque, Benjamin McMahon, Sayera Dhaubhadel, Kathryn Knight, Joseph Erdos, Donna Reagan, Caroline Taylor, Peter Kuzmak, John Michael Gaziano, Eileen McAllister, Lauren Costa, Yuk-Lam Ho, Kelly Cho, Suzanne Tamang, Samah Fodeh-Jarad, Olga S. Ovchinnikova, Amy C. Justice, Jacob Hinkle, Ioana Danciu
Objectives: This study aims to assess the impact of domain shift on chest X-ray classification accuracy and to analyze the influence of ground truth label quality and demographic factors such as age group, sex, and study year. Materials and Methods: We used a DenseNet121 model pretrained on the MIMIC-CXR dataset for deep learning-based multilabel classification, using ground truth labels extracted from radiology reports with the CheXpert and CheXbert labelers. We compared the performance of the 14 chest X-ray labels on the MIMIC-CXR and Veterans Healthcare Administration chest X-ray (VA-CXR) datasets. The VA-CXR dataset comprises over 259k chest X-ray images spanning the years 2010 to 2022. Results: The validation of ground truth and the assessment of multi-label classification performance across various NLP extraction tools revealed that the VA-CXR dataset exhibited lower disagreement rates than the MIMIC-CXR dataset. Additionally, there were notable differences in AUC scores between models utilizing CheXpert and CheXbert. When evaluating multi-label classification performance across different datasets, minimal domain shift was observed in unseen datasets, except for the label "Enlarged Cardiomediastinum." The subgroup analyses by study year exhibited the most significant variations in multi-label classification model performance. These findings underscore the importance of considering domain shifts in chest X-ray classification tasks, particularly concerning study years. Conclusion: Our study reveals the significant impact of domain shift and demographic factors on chest X-ray classification, emphasizing the need for improved transfer learning and equitable model development. Addressing these challenges is crucial for advancing medical imaging and enhancing patient care.
Assessing generalization of SGD via disagreement
Imagine training a deep network twice with two different random seeds on the same data, and then measuring the rate at which they disagree on unlabeled test points. Naively, they can disagree with one another with probability anywhere between zero and twice the error rate. But surprisingly, in practice, we observe that the disagreement and test error of deep neural networks are remarkably close to each other: the average generalization error of the two models closely tracks their disagreement rate. Estimating the generalization error of a model -- how well the model performs on unseen data -- is a fundamental component of any machine learning system. Generalization performance is traditionally estimated in a supervised manner, by dividing the labeled data into a training set and a test set.
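This observation is cheap to reproduce at small scale. The sketch below (our own illustration, not the authors' code) trains the same architecture twice with different seeds and compares the disagreement rate, computed without any test labels, against the actual average test error:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=4000, n_features=30, n_informative=10,
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Same architecture, same data, two different random seeds.
models = [MLPClassifier(hidden_layer_sizes=(64, 64), max_iter=300,
                        random_state=seed).fit(X_tr, y_tr)
          for seed in (1, 2)]
p1, p2 = (m.predict(X_te) for m in models)

disagreement = np.mean(p1 != p2)                  # needs no test labels
avg_test_err = np.mean([np.mean(p != y_te) for p in (p1, p2)])
print(f"disagreement: {disagreement:.3f}  avg test error: {avg_test_err:.3f}")
```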